The fate of Fulanus

Fulanus beltranus is a fictional species of lizard from Democratic Republic of Congo. We would like to find out how this species is going to do in different climate change scenarios. First let’s look at the points where it is currently known to occurr. To access the distribution data, we must load package Mapinguari.

library(Mapinguari)

Fulanus_bbox <- ggmap::make_bbox(lat = Lat, 
  lon = Lon, 
  data = FulanusDistribution,
  f = 0.2)

Fulanus_big <- ggmap::get_map(location = Fulanus_bbox, 
  source = "google", 
  maptype = "terrain")

ggmap::ggmap(Fulanus_big) +
  ggplot2::geom_point(data = FulanusDistribution, 
    mapping = ggplot2::aes(x = Lon, y = Lat), 
    size = 1, 
    colour = 'red')

function Distribution

It seems some points are in the ocean! Those must be wrong, let’s use Mapinguari’s Distribution fuction to clean them up.

# insert here point clean up and plot again

function Ecology

That is a terrible name, I wanted to call it “Environment”, but environment has a very specific meaning in R and it would be confusing.

How does the climate looks like in this area? We can use function Ecology to extract WorldClim surfaces for that area. The argument ext tells us which is the area you desire to crop your rasters around. It accepts either an Extent object with the coordinates of cropping limits, or a table of longitudes and latitudes, in which case it will grab the coordinates of the most extreme points and use those as the cropping limits. The argument margin adds to these limits, and it is on the same unit as the coordinates you supplied.

FulanusEcoRasters_present <-
   Ecology(
     raster_source = "/Users/gabriel/Documents/Mapinguari-development/global_grids_10_minutes",
     ext = FulanusDistribution,
     margin = 5,
     non_fixed_var = c('prec', 'tmin', 'tmax'),
     fixed_var = 'alt',
     years = c("present"),
     Phen_method = 'year',
     reorder = TRUE)

raster::plot(FulanusEcoRasters_present[[1]])

Ecology is able to download rasters from WorldClim, but you probably don’t want to download them everytime you are executing the function. That’s why the argument raster_source will also take a list of rasters from your workspace or a path to a folder containing those rasters.

However, in order for Mapinguari to recognize which variables correspond to each year or scenario, you have to name your folders and list elements following the convention variable_year_scenario. The defaul character for separating those terms is “_“, but you can change that, as long as you do the corresponding change to the argument separator in function. Some of your rasters will not be subject to scenarios, since they are measures of current climate, instead of projections. In that case you only have to name them variable_year, omitting the scenario term. The argument baseline identifies which years are not subject to scenarios, and the default is present and baseline, but it can be changed at your convenience. Some other variables are constant accross time, such as altitude. In that case you only have to write the name of the variable. Here is an example of how my folder is structured:

image:

I’m going to assign the path to that directory to an object:

my_directory <- "/Users/gabriel/Documents/Mapinguari-development/global_grids_10_minutes"

Why did I place some variables in the non-fixed argument and others in the fixed argument? Well, that will become relevant once we start making future projections. Variables on non_fixed are things such as climate, which you expect to change in different times and future scenarios, while variables in fixed are things such as geological features, which you expect to remain constant accross the time projected.

Another important argument for non-fixed variables is reorder, which will take the last two characters of your RasterLayer names, replace letters with zeros and order the layers in ascending order of those numbers. This is useful because of the way some of the WorldClim layers are named when you downloaded, which when stacked will be placed out of chronological order. This argument fixes that. But be careful if you are using layers with different names than the ones from WorldClim. If they are in the correct order when you stack them, reorder could scramble them, so set it to FALSE.

Let’s try doing some projections for the future years of 2050 and 2070:

FulanusEcoRasters_future <-
   Ecology(
     raster_source = my_directory,
     ext = FulanusDistribution,
     margin = 5,
     non_fixed_var = c('prec', 'tmin', 'tmax'),
     years = c('2050', '2070'),
     scenarios = c('rcp26', 'rcp45', 'rcp85'),
     Phen_method = 'year',
     reorder = TRUE)

raster::plot(raster::stack(
  FulanusEcoRasters_future$`2050_rcp26`$tmax_year_mean, 
  FulanusEcoRasters_future$`2050_rcp45`$tmax_year_mean, 
  FulanusEcoRasters_future$`2050_rcp85`$tmax_year_mean))

We might want to derive more complex variables that might be relevant to this species, such as Potential EvapoTranspiration (PET), Actual EvapoTranspiration (AET) or Climatic Water Deficit (CWD). Since those variables are calculated from the WorldClim variables, you need to supply those dependencies on raster_source. make table of derivable variables and dependencies. You also have to set the argument derive to TRUE, in order for the function to understand you are not trying to fetch the derived variables from raster_source (you can derive them the first time, save and load them from the directory like the other variables).

FulanusEcoRasters_derived <-
   Ecology(
     raster_source = my_directory,
     ext = FulanusDistribution,
     margin = 5,
     non_fixed_var = c('tmin', 'tmax', 'prec', 'PET', 'AET', 'CWD'),
     years = c('present', '2050', '2070'),
     scenarios = c('rcp26', 'rcp45', 'rcp85'),
     Phen_method = 'year',
     reorder = TRUE,
     derive = TRUE)

raster::plot(raster::stack(
  FulanusEcoRasters_derived$present$PET_year_mean, 
  FulanusEcoRasters_derived$present$AET_year_mean, 
  FulanusEcoRasters_derived$present$CWD_year_mean))

You might have noticed the argument Phen_method, which in our examples is set to ‘year’. That means the non-fixed variables, which are usually in monthly resolution, will be averaged accross the year. If you want to preserve them in monthly resolution, you can set it to month. There is another setting, which is season, that I will describe in more detail when talking about the Phenology function.

function Physiology

This function main goal is to create models of the relationship between a physiological measure and a environmental one, and output a closure so the physiological measure can be easily extrapolated to other environmental data, such as spatial or temporal data. It also evaluates the models, allowing the users to compare different parameterizations. For now, the function wraps mgcv::gamm() and nlme::nlme(), allowing for flexible additive models and rigorous non-linear mixed models. You can get creative and use the functions for measures other than physiology. For example, you might want to fit a logisitic model of reproductive state against environment, which can be useful as an input for function Phenology.

Back to Fulanus, we measured the maximum running speed of several individuals under different temperatures. We also recorded their body sizes, since that is a variable that is very likely to inffluence their performance. Here is how the data look like:

head(FulanusPhysiology)
##             species   id temp performance  size    site acctemp hydration
## 1 Fulanus_beltranus 2412    3       22.26 1.994 Cafundo      10     0.825
## 2 Fulanus_beltranus 2015    3       24.30 3.327 Cafundo      10     0.825
## 3 Fulanus_beltranus 2122    3       25.02 2.658 Cafundo      10     0.825
## 4 Fulanus_beltranus 2220    3       25.44 2.346 Cafundo      10     0.825
## 5 Fulanus_beltranus 2022    3       26.04 3.006 Cafundo      10     0.775
## 6 Fulanus_beltranus 2212    3       26.82 2.504 Cafundo      10     0.775
# I will multiply temperatures by 10 so they are on the same scale as the rasters
FulanusPhysiology$temp <- FulanusPhysiology$temp * 10

head(FulanusPhysiology)
##             species   id temp performance  size    site acctemp hydration
## 1 Fulanus_beltranus 2412   30       22.26 1.994 Cafundo      10     0.825
## 2 Fulanus_beltranus 2015   30       24.30 3.327 Cafundo      10     0.825
## 3 Fulanus_beltranus 2122   30       25.02 2.658 Cafundo      10     0.825
## 4 Fulanus_beltranus 2220   30       25.44 2.346 Cafundo      10     0.825
## 5 Fulanus_beltranus 2022   30       26.04 3.006 Cafundo      10     0.775
## 6 Fulanus_beltranus 2212   30       26.82 2.504 Cafundo      10     0.775

Note that we also attributed a number to each lizard to keep track of which lizard did which trial. That is important because, since the same lizard ran different trials, the data points are not independent and we have to account for that when building our model.

The function is simple and accepts any argument from mgcv::gamm() or nlme::nlme():

perf_functions <-
  Physiology(formula = performance ~ s(temp, bs = 'cs') + size,
    data = FulanusPhysiology,
    type = 'GAMM',
    random = list(id = ~ 1)
  )

type indicates which function will you use to build your model, it accepts the values ‘GAMM’ and ‘NLME’ (I didn’t use method because that was an argument in nlme). formula, data and random are the same as in mgcv::gamm(), and you can find more detail about those at their package documentation. formula will describe the structure of your model, including smooth terms (for method ‘GAMM’) and nonlinear terms (for method ‘NLME’). data is simply a data frame containing the variables specified in the formula. random accounts for the structure of random effects in your mixed model.

The function outputs a table containing informations about each model fitted and a closure to be used for predictions. You can access the closures and statistics by subsetting the table, as shown below.

perf_functions$formula$model_1
## performance ~ s(temp, bs = "cs") + size
perf_functions$stats$model_1
## $AIC
## [1] 6440.933
## 
## $BIC
## [1] 6459.027
## 
## $logLik
## 'log Lik.' -3216.467 (df=4)
perf_functions$output$model_1
## $lme
## Linear mixed-effects model fit by maximum likelihood
##   Data: strip.offset(mf) 
##   Log-likelihood: -3216.467
##   Fixed: y ~ X - 1 
## X(Intercept)        Xsize 
##     66.05077      4.72055 
## 
## Random effects:
##  Formula: ~Xr - 1 | g
##  Structure: pdIdnot
##              Xr1      Xr2      Xr3      Xr4      Xr5      Xr6      Xr7
## StdDev: 20.41074 20.41074 20.41074 20.41074 20.41074 20.41074 20.41074
##              Xr8      Xr9 Residual
## StdDev: 20.41074 20.41074  26.2192
## 
## Number of Observations: 681
## Number of Groups: 1 
## 
## $gam
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## performance ~ s(temp, bs = "cs") + size
## 
## Estimated degrees of freedom:
## 8.13  total = 10.13 
## 
## 
## attr(,"class")
## [1] "gamm" "list"
my_tpc <- perf_functions$predict$model_1

my_tpc(temp = 20, size = 10)
## [1] 69.08478
gam_for_plot <- perf_functions$output$model_1

plot(gam_for_plot$gam, 
  pages = 1, 
  residuals = TRUE, 
  pch = 19, 
  seWithMean = TRUE, 
  shade = TRUE, 
  shade.col = "gray")

gam_for_plot$gam$data <- FulanusPhysiology

pred_plot <- visreg::visreg(gam_for_plot$gam, "temp", type = "conditional", plot = FALSE)
pred_plot$fit$performance <- mgcv::predict.gam(object = gam_for_plot$gam, newdata = pred_plot$fit)

# The ggplot:
ggplot2::ggplot(pred_plot$fit, ggplot2::aes(x = temp, y = performance)) + 
  ggplot2::geom_line() +
  ggplot2::geom_line(ggplot2::aes(y = visregLwr), linetype = "dashed") + 
  ggplot2::geom_line(ggplot2::aes(y = visregUpr), linetype = "dashed") +
  ggplot2::geom_jitter(FulanusPhysiology, mapping = ggplot2::aes(x = temp, y = performance, size = size, colour = id)) +
  ggplot2::scale_size_continuous(range = c(2,4)) +
  ggplot2::geom_rug() +
  ggplot2::xlab("Temperature °C") +
  ggplot2::ylab("Speed m/s") + 
  ggplot2::theme(legend.position = "none") +
  ggplot2::ylim(0, 175)
## Warning: Removed 6 rows containing missing values (geom_path).
## Warning: Removed 8 rows containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_path).
## Warning: Removed 7 rows containing missing values (geom_point).

This closure can be combined with the output from Ecology to create a raster of physiology, which varies according to local environmental conditions. This is the subject of the next function.

EcoPhysiology

EcoPhysiology does the link between your physiological model and your spatial environmental data. It will apply any function to each year/scenario combination, as long as the variables necessary for the model are present. Sometimes you might name variables differently on your model and on your rasters. To deal with this situation, the argument Perf_args acts as a dictionary. It is a named list, in which the names are the variables in the model and the arguments are character strings of the variables in the rasters. You might also supply a numerical constant instead of a character string. That way, this variable will be set to that value for all calculations. That might be useful for example, if your model contain individual covariates such as body size, and you want to set a standard value for every calculation, for example, average body size.

FulanusEcoRasters_month <-
   Ecology(
     raster_source = my_directory,
     ext = FulanusDistribution,
     margin = 5,
     non_fixed_var = c('tmax', 'prec'),
     years = 'present',
     Phen_method = 'month',
     reorder = TRUE,
     derive = TRUE)


Perf_rasters_month <-
  EcoPhysiology(raster_source = FulanusEcoRasters_month,
    Perf_args = list(temp = 'tmax', size = mean(FulanusPhysiology$size)),
    separator = '_',
    PerfFUN = my_tpc)

raster::plot(Perf_rasters_month[[1]][[1]])

Phenology

However, we know that the most crucial period for Fulanus population dynamics is their breeding period. They wil breed only on certain climatic conditions, what will cause their breeding period to vary spatially and temporally. In order to account for that, function Phenology partition month rasters according to the specified breeding period. You can input the starting and ending month of the phenological event with arguments StartSeason and StopSeason. This will make the partitioning uniform inside and accross rasters. If you want to account for spatial and temporal variation, you have to use argument PhenFUN, which accepts functions relating for local conditions to a binary representation of if the organisms are experiencing the phenological event or not. It could be a logistic model or a function setting hard thresholds, for example. The function Phenology can also be called from inside Ecology or EcoPhysiology, partitioning the rasters as they are output. Phenology has also a Phen_args argument, which works in the same way as Perf_args described above

# Fix this later: simulate data on breeding seasons and environmental conditions, then fit this model

PhenFUN1 <- function(x) round(1/(1 + exp((150 - x)/2)), 2)

Seasons_by_rain <-
  lapply(FulanusEcoRasters_month, 
    Phenology,
    PhenFUN = PhenFUN1 ,
    Phen_args = list(x = 'prec'))

raster::plot(Seasons_by_rain$present$Season_length)

raster::plot(Seasons_by_rain$present$tmax_Season)

raster::plot(Seasons_by_rain$present$tmax_NonSeason)

# Let's say we are only interested on the periods when performance is bigger than 80.

PhenFUN2 <- function(x) ifelse(x > 80, 1, 0)

Seasons_by_performance <-
  lapply(Perf_rasters_month, 
    Phenology,
    PhenFUN = PhenFUN2 ,
    Phen_args = list(x = 'Perf'))

raster::plot(Seasons_by_performance$present$Season_length)

raster::plot(Seasons_by_performance$present$Perf_Season)

raster::plot(Seasons_by_performance$present$Perf_NonSeason)

Note that the season by performance maps have blank spots both in the Season and NonSeason rasters, while the season by rain maps have blank spots only on the Season raster. Why is that? Let’s look at the values for the season length at the present for both the Seasons_by_rain and Seasons_by_performance.

Seasons_by_rain$present$Season_length
## class       : RasterLayer 
## dimensions  : 75, 119, 8925  (nrow, ncol, ncell)
## resolution  : 0.1666667, 0.1666667  (x, y)
## extent      : 17.16667, 37, -4, 8.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## data source : in memory
## names       : Season_length 
## values      : 0, 9.07  (min, max)
Seasons_by_performance$present$Season_length
## class       : RasterLayer 
## dimensions  : 75, 119, 8925  (nrow, ncol, ncell)
## resolution  : 0.1666667, 0.1666667  (x, y)
## extent      : 17.16667, 37, -4, 8.5  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## data source : in memory
## names       : Season_length 
## values      : 0, 12  (min, max)

You can see that the season by rain raster ranges from 0 to 9, that means that the species is unable to breed in some places, and the places where it breeds for longer, the breeding season lasts 9 months. While the season by performance raster ranges from 0 to 12, that means there are places where the species can’t breed and places where the species can breed all year long. That shows on the map as the blank spots. If the values are 0 on the Season raster, it means the species can’t breed in that location, if they are 0 in the NonSeason raster, it means they breed all year long in that location.

#load libraries
library(raster)
## Loading required package: sp
library(ggplot2)
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:raster':
## 
##     extract
 prec_during_breeding <- Seasons_by_rain$present$prec_Season
 tmax_during_breeding <- Seasons_by_rain$present$tmax_Season
 season_length_breeding <- Seasons_by_rain$present$Season_length

map_ext <-
  prec_during_breeding %>% 
  raster::extent() %>% 
  as.vector()

names(map_ext) <- c("left", "right", "bottom", "top")

Fulanus_big <- ggmap::get_map(location = map_ext, 
  source = "stamen", 
  maptype = "toner-hybrid")
## Map from URL : http://tile.stamen.com/toner-hybrid/6/35/30.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/36/30.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/37/30.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/38/30.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/35/31.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/36/31.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/37/31.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/38/31.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/35/32.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/36/32.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/37/32.png
## Map from URL : http://tile.stamen.com/toner-hybrid/6/38/32.png
Fulanus_big
## 570x902 toner-hybrid map image from Stamen Maps.  see ?ggmap to plot it.
borders <- ggmap::ggmap(Fulanus_big)

#open ASCII file using ‘raster’ command, which converts the ASCII to a raster object
map <- prec_during_breeding

#convert the raster to points for plotting
map.p <- rasterToPoints(map)

#Make the points a dataframe for ggplot
prec_df <- data.frame(map.p)
#Make appropriate column headings
colnames(prec_df) <- c("Longitude", "Latitude", "MAP")


#open ASCII file using ‘raster’ command, which converts the ASCII to a raster object
map <- tmax_during_breeding

#convert the raster to points for plotting
map.p <- rasterToPoints(map)

#Make the points a dataframe for ggplot
tmax_df <- data.frame(map.p)
#Make appropriate column headings
colnames(tmax_df) <- c("Longitude", "Latitude", "MAP")

#open ASCII file using ‘raster’ command, which converts the ASCII to a raster object
map <- season_length_breeding

#convert the raster to points for plotting
map.p <- rasterToPoints(map)

#Make the points a dataframe for ggplot
length_df <- data.frame(map.p)
#Make appropriate column headings
colnames(length_df) <- c("Longitude", "Latitude", "MAP")

#Call in point data, in this case a fake transect (csv file with lat and lon coordinates)
sites <- FulanusDistribution

#Now make the map
prec_plot <-
  borders +
  geom_raster(data = prec_df, aes(y = Latitude, x = Longitude, fill = MAP)) +
  geom_point(data = sites, aes(x = Lon, y = Lat), color = "white", size = 3, shape = 4) +
  theme_bw() +
  coord_equal() +
  scale_fill_gradientn(colours = rev(rainbow(4)), breaks = c(140, 200, 280, 340)) +
  theme(axis.title.x = element_text(size = 16),
    axis.title.y = element_text(size = 16, angle = 90),
    axis.text.x = element_text(size = 14),
    axis.text.y = element_text(size = 14),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "right",
    legend.key = element_blank()
  )

prec_plot$layers <- rev(prec_plot$layers)

tmax_plot <-
  borders +
  geom_raster(data = tmax_df, aes(y = Latitude, x = Longitude, fill = MAP)) +
  geom_point(data = sites, aes(x = Lon, y = Lat), color = "white", size = 3, shape = 4) +
  theme_bw() +
  coord_equal() +
  scale_fill_gradientn(colours = rev(rainbow(4)), breaks = c(140, 200, 280, 340)) +
  theme(axis.title.x = element_text(size = 16),
    axis.title.y = element_text(size = 16, angle = 90),
    axis.text.x = element_text(size = 14),
    axis.text.y = element_text(size = 14),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "right",
    legend.key = element_blank()
  )

tmax_plot$layers <- rev(tmax_plot$layers)

length_plot <-
  borders +
  geom_raster(data = length_df, aes(y = Latitude, x = Longitude, fill = MAP)) +
  geom_point(data = sites, aes(x = Lon, y = Lat), color = "white", size = 3, shape = 4) +
  theme_bw() +
  coord_equal() +
  scale_fill_gradientn(colours = rev(rainbow(4)), breaks = c(140, 200, 280, 340)) +
  theme(axis.title.x = element_text(size = 16),
    axis.title.y = element_text(size = 16, angle = 90),
    axis.text.x = element_text(size = 14),
    axis.text.y = element_text(size = 14),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    legend.position = "right",
    legend.key = element_blank()
  )

length_plot$layers <- rev(length_plot$layers)

length_plot

prec_plot

tmax_plot